Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read in parameter PMFs #16

Merged
merged 16 commits into from
Sep 4, 2024
Merged

Read in parameter PMFs #16

merged 16 commits into from
Sep 4, 2024

Conversation

zsusswein
Copy link
Collaborator

@zsusswein zsusswein commented Aug 30, 2024

For generation interval, delays, and right-truncation from a local file.

It assumes that:

It relaxes the assumptions that:

  • The PMFs are coming from the same file
  • The PMFs must be present (can skip delays and right-truncation but
    not GI)
  • The files are in Azure or have a specific name/path

Unit tests cover successfully reading all the parameters individually
as well as in combination. They also check for failure in the expected
places for desired failure modes.

Switching over to this schema from manual CSVs now is a bit of a choice.
I took three cracks at this PR before landing on doing it this way. I
think it provides a couple of important benefits to make the switch now
and that comes through in designing the code here:

  1. Our existing CSV-based approach produces 3 distinct files with
    close-ish schemas, but not an exact match. It requires distinct
    reader functions and substantially more code than reading from a
    file with a unified schema like I do here.
  2. I think this approach helps avoid issues like this week's production
    mishap. It allows us to mix and match files as needed, and we can
    point to a single drop-in file for testing if desired. We don't need
    to fiddle with the production environment.
  3. I think we're going to need to make a switch on the parameter
    approach sooner or later. I wanted to bite the bullet and get it
    over with all at once. It's convenient that it helps make the code
    simpler, but I think making changes swiftly rather than dragging
    things out provides its own benefit too.

@zsusswein zsusswein force-pushed the zs-read-params-take-2 branch 2 times, most recently from 0b054f6 to b637fc9 Compare August 30, 2024 19:15
Copy link

codecov bot commented Aug 30, 2024

Welcome to Codecov 🎉

Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests.

Thanks for integrating Codecov - We've got you covered ☂️

@zsusswein zsusswein force-pushed the zs-read-params-take-2 branch 2 times, most recently from 15712d6 to fa6acaa Compare August 31, 2024 00:46
@zsusswein zsusswein changed the title Read in delay and GI PMFs Read in parameter PMFs Aug 31, 2024
R/parameters.R Outdated Show resolved Hide resolved
R/parameters.R Outdated Show resolved Hide resolved
@zsusswein zsusswein force-pushed the zs-read-params-take-2 branch 2 times, most recently from d1920ed to 986fdc3 Compare August 31, 2024 15:01
@zsusswein zsusswein marked this pull request as ready for review August 31, 2024 15:07
Copy link
Collaborator

@natemcintosh natemcintosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good. There's maybe a few things we could do to make it even more strict, but I don't think they're necessary; it's already pretty strict

R/parameters.R Show resolved Hide resolved
R/parameters.R Show resolved Hide resolved
R/parameters.R Show resolved Hide resolved
R/parameters.R Outdated Show resolved Hide resolved
tests/testthat/test-parameters.R Outdated Show resolved Hide resolved
R/parameters.R Outdated Show resolved Hide resolved
tests/testthat/test-parameters.R Show resolved Hide resolved
tests/testthat/test-parameters.R Show resolved Hide resolved
NEWS.md Outdated Show resolved Hide resolved
NAMESPACE Outdated Show resolved Hide resolved
R/parameters.R Outdated Show resolved Hide resolved
R/parameters.R Outdated Show resolved Hide resolved
For generation interval, delays, and right-truncation.

It assumes that:
- We want a PMF
- The PMF is coming from a file with schema specified as in
  cdcent/cfa-parameter-estimates#9
- The parameter names and disease names are following a specified schema
- The file is in parquet format
- The PMFs are actually proper PMFs

It relaxes the assumptions that:
- The PMFs are coming from the same file
- The PMFs must be present (can skip delays and right-truncation but
  not GI)
- The files are in Azure or have a specific name/path

Unit tests cover successfully reading all the parameters individually
as well as in combination. They also check for failure in the expected
places for desired failure modes.

Switching over to this schema from manual CSVs now is a bit of a choice.
I took three cracks at this PR before landing on doing it this way. I
think it provides a couple of important benefits to make the switch now
and that comes through in designing the code here:
1. Our existing CSV-based approach produces 3 distinct files with
   close-ish schemas, but not an exact match. It requires distinct
   reader functions and substantially more code than reading from a
   file with a unified schema like I do here.
2. I think this approach helps avoid issues like this week's production
   mishap. It allows us to mix and match files as needed, and we can
   point to a single drop-in file for testing if desired. We don't need
   to fiddle with the production environment.
3. I think we're going to need to make a switch on the parameter
   approach sooner or later. I wanted to bite the bullet and get it
   over with all at once. It's convenient that it helps make the code
   simpler, but I think making changes swiftly rather than dragging
   things out provides its own benefit too.
@zsusswein zsusswein force-pushed the zs-read-params-take-2 branch from cff6f42 to 6a785b3 Compare September 3, 2024 20:55
For additional specificity linking expected errors to tests. I use the
regexp option instead of classed errors for `arg_match()` because I had
trouble catching those even when matching the class exactly.
@zsusswein

This comment was marked as resolved.

Copy link
Collaborator

@kgostic kgostic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some fussy comments, but big-picture this looks good.

R/parameters.R Outdated Show resolved Hide resolved
R/parameters.R Show resolved Hide resolved
R/parameters.R Outdated Show resolved Hide resolved
R/parameters.R Show resolved Hide resolved
R/parameters.R Show resolved Hide resolved
R/parameters.R Outdated Show resolved Hide resolved
R/parameters.R Show resolved Hide resolved
R/parameters.R Show resolved Hide resolved
R/parameters.R Show resolved Hide resolved
But I left state in the documentation for clarity because we don't
currently do anything at the sub-state level
@zsusswein

This comment was marked as resolved.

@zsusswein zsusswein requested a review from athowes September 4, 2024 13:36
Copy link
Collaborator

@athowes athowes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me :)

@zsusswein zsusswein merged commit 9152bc0 into main Sep 4, 2024
8 checks passed
@zsusswein zsusswein deleted the zs-read-params-take-2 branch September 4, 2024 13:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants